Instructions

Below you will find several empty R code scripts and a few places where a line starts with the word “Answer:”. Your task is to fill in the required code and answer the questions as stated.

Eggs Dataset

Today you will be working with a dataset of birds:

Here is a full data dictionary describing all of the variables:

Notice that the last two variables are integer codes. They are stored as numbers but correspond to a category.

Starting plot

Create a scatter plot showing the mass of a male bird (x-axis) and the mass of an egg (y-axis):
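As a minimal sketch of what this chunk could look like: the data frame name birds and the column names male_mass, egg_mass, type and name are assumptions, so swap in the names from the data dictionary. The toy rows are made up for illustration, except for the tropicbird masses quoted later in this handout.

```r
library(ggplot2)

# Toy stand-in for the course data. Column names are assumptions, and every
# value is made up except the tropicbird row (218.7 g male, 87 g egg).
birds <- data.frame(
  name      = c("bird A", "bird B", "bird C", "Red-tailed tropicbird", "bird D"),
  type      = c("Parrot", "Parrot", "Other", "Other", "Other"),
  male_mass = c(30, 400, 12, 218.7, 9000),
  egg_mass  = c(2.5, 20, 1.4, 87, 350)
)

# Male mass on the x-axis, egg mass on the y-axis, one point per bird
p <- ggplot(birds, aes(x = male_mass, y = egg_mass)) +
  geom_point()
p
```

With the real data, only the data frame and the two column names inside aes() should need to change.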

You should notice that the plot’s scale makes it hard to see the relationship between the two variables.

Changing the scale

Now add the layers scale_x_log10() and scale_y_log10() to the plot.
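A sketch of the same scatter plot with both axes on a log scale. As before, birds and its column names are placeholders, and the toy values are made up apart from the tropicbird row.

```r
library(ggplot2)

# Toy stand-in for the course data (assumed column names, made-up values
# except the tropicbird row).
birds <- data.frame(
  name      = c("bird A", "bird B", "bird C", "Red-tailed tropicbird", "bird D"),
  type      = c("Parrot", "Parrot", "Other", "Other", "Other"),
  male_mass = c(30, 400, 12, 218.7, 9000),
  egg_mass  = c(2.5, 20, 1.4, 87, 350)
)

# Each scale_*_log10() layer replaces the default linear axis with a log10 axis
p_log <- ggplot(birds, aes(x = male_mass, y = egg_mass)) +
  geom_point() +
  scale_x_log10() +
  scale_y_log10()
p_log
```

On log scales, equal distances along an axis correspond to equal ratios, which spreads out the many small birds that are crowded into one corner of the linear plot.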

How would you now describe the relationship between the two variables (I just need one sentence here)?

Answer: Egg mass increases with male mass: the heavier the male, the heavier the egg.

Parrots

Create a new dataset called parrots consisting of just those birds that are parrots (hint: use the type variable; double hint: look at the raw data for exactly how to format the filter query):
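One way the filter could look, assuming dplyr is the intended tool and that type is stored as text with the value "Parrot". Both are assumptions: check the raw data for the exact spelling and capitalization, because the comparison is case-sensitive.

```r
library(dplyr)

# Toy stand-in for the course data (assumed column names, made-up values
# except the tropicbird row).
birds <- data.frame(
  name      = c("bird A", "bird B", "bird C", "Red-tailed tropicbird", "bird D"),
  type      = c("Parrot", "Parrot", "Other", "Other", "Other"),
  male_mass = c(30, 400, 12, 218.7, 9000),
  egg_mass  = c(2.5, 20, 1.4, 87, 350)
)

# Keep only the rows whose type matches exactly; "parrot" would match nothing here
parrots <- filter(birds, type == "Parrot")
parrots
```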

Now add a layer to the previous plot (keeping the log scales) where the parrots are highlighted in the color “red”. To make them stand out, make the base layer have an alpha value of 0.15. Finally, add a text annotation telling the reader that the red points are parrots.
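A sketch of the layered plot under the same assumptions (placeholder data and column names, type equal to "Parrot"). The annotation position and wording are arbitrary choices and will need adjusting for the real data.

```r
library(ggplot2)
library(dplyr)

# Toy stand-in for the course data (assumed column names, made-up values
# except the tropicbird row).
birds <- data.frame(
  name      = c("bird A", "bird B", "bird C", "Red-tailed tropicbird", "bird D"),
  type      = c("Parrot", "Parrot", "Other", "Other", "Other"),
  male_mass = c(30, 400, 12, 218.7, 9000),
  egg_mass  = c(2.5, 20, 1.4, 87, 350)
)
parrots <- filter(birds, type == "Parrot")

p_parrots <- ggplot(birds, aes(x = male_mass, y = egg_mass)) +
  geom_point(alpha = 0.15) +                   # faded base layer with all birds
  geom_point(data = parrots, color = "red") +  # parrots drawn on top in red
  annotate("text", x = 10, y = 300, hjust = 0, color = "red",
           label = "Parrots are shown in red") +  # x and y are in data units
  scale_x_log10() +
  scale_y_log10()
p_parrots
```

The second geom_point() inherits the x and y mapping from ggplot() and only swaps in a different data frame, which is why parrots needs the same column names as birds.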

Smoothing line

Now, we are going to add a best-fit line to the plot. We do this by adding geom_smooth(method = "lm") to the plot. Add this to the plot using the log-log scale, but without highlighting the parrots.

I think the best-fit line is a bit too colorful and noisy. Fix it by changing the line to this instead: geom_smooth(method = "lm", color = "black", se = FALSE, linetype = "dashed", size = 0.5).
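A sketch of the log-log plot with the restyled best-fit line, again on placeholder data. One deliberate difference from the text above: it uses linewidth instead of size, because ggplot2 3.4.0 and later deprecate size for lines and warn about it. On an older ggplot2, keep size = 0.5.

```r
library(ggplot2)

# Toy stand-in for the course data (assumed column names, made-up values
# except the tropicbird row).
birds <- data.frame(
  name      = c("bird A", "bird B", "bird C", "Red-tailed tropicbird", "bird D"),
  type      = c("Parrot", "Parrot", "Other", "Other", "Other"),
  male_mass = c(30, 400, 12, 218.7, 9000),
  egg_mass  = c(2.5, 20, 1.4, 87, 350)
)

p_fit <- ggplot(birds, aes(x = male_mass, y = egg_mass)) +
  geom_point() +
  # se = FALSE drops the confidence band; the line is fit to the log10 values
  geom_smooth(method = "lm", color = "black", se = FALSE,
              linetype = "dashed", linewidth = 0.5) +
  scale_x_log10() +
  scale_y_log10()
p_fit
```

Because the scales transform the data before the model is fit, the straight line here describes log10(egg mass) as a linear function of log10(male mass).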

Does the best-fit match the visual pattern you saw between the size of a bird and the size of its eggs (again, one sentence is sufficient)?

Answer: Yes, the best-fit line follows the same upward trend: heavier males go with heavier eggs.

Outliers

If you look at the plot, you’ll see one bird in particular that has a very large egg size given the mass of the bird itself. This is the Red-tailed tropicbird (also, you can add pictures to Rmarkdown!):

The tropicbird has a male mass of 218.7g and an egg mass of 87.00g. Annotate this point on the graph and give a label for it:
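A sketch of one way to mark and label the point, using the two masses given above as coordinates. The data frame, column names, color, and label offset are placeholder choices.

```r
library(ggplot2)

# Toy stand-in for the course data (assumed column names, made-up values
# except the tropicbird row).
birds <- data.frame(
  name      = c("bird A", "bird B", "bird C", "Red-tailed tropicbird", "bird D"),
  type      = c("Parrot", "Parrot", "Other", "Other", "Other"),
  male_mass = c(30, 400, 12, 218.7, 9000),
  egg_mass  = c(2.5, 20, 1.4, 87, 350)
)

p_outlier <- ggplot(birds, aes(x = male_mass, y = egg_mass)) +
  geom_point(alpha = 0.15) +
  # Redraw the tropicbird as a solid point at its (male mass, egg mass)
  annotate("point", x = 218.7, y = 87, color = "blue") +
  # vjust = -1 lifts the label just above the point
  annotate("text", x = 218.7, y = 87, vjust = -1,
           label = "Red-tailed tropicbird") +
  scale_x_log10() +
  scale_y_log10()
p_outlier
```

annotate() takes coordinates in the original data units even when the axes are on a log scale, so the masses can be entered exactly as given.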

Your turn

Construct one final graph of the data. You are free to use the other variables that we did not look at yet or to look at different classes of birds. For this graph (only), please add an appropriate title and annotations.